[0.13] Core: Fix filter pushdown for metadata tables with evolved specs (#4520) #4569
2ab0361 to 7cb2b6e
Yes, at first glance I do think they both should be backported. The release manager is @nastra, FYI, but I personally think that if people are encountering this and it's an obvious bug, we backport it. I'd personally also like to see #3411 backported, as I've recently encountered issues with partition summaries using the void transform (which could be unrelated to both, as I need to take a closer look): #4689. But if it's ready, I don't see why not. Any ideas on a possible timeline for both? I know that's highly dependent on things you can't control, but we've admittedly delayed 0.13.2 for quite some time. TLDR - On first pass I would think to include both. But I do have some concerns around holding up the release too much. I'll spend time on a proper pass over this tomorrow :)
Hi @kbendick, yeah, I already requested the backport for #3411; @ConeyLiu did it in #4572 and I merged it. For this one, I think @rdblue was OK with it and tagged the original issue #4520 for the 0.13.2 backport, so I opened this backport PR for convenience in case it can be merged. (By the way, the problem with the https://github.com/apache/iceberg/milestone/18 tag is that a lot of GitHub issues are marked 'closed' but aren't really backported.)
Looking at the combination of #4520, the other PRs that were backported, and this, this seems good.
Given it's been some time and other things have been added, I'd suggest rebasing. I'd also consider porting the tests for the other Spark versions to this PR, just so we can ensure that they work. But barring that (or any disagreement with that), I'm +1 on this. The added Spark 3.2 tests are pretty comprehensive.
Thank you @szehon-ho!
7cb2b6e to 0ad1a29
Thanks @kbendick for the review, rebased. Re: Spark versions, the main fix is in core, so I think the Spark side is agnostic (I added the Spark 3.2 test mainly as an end-to-end check). Anyway, it's the same fix as #4520 on the master branch, and I can look at adding some tests for other Spark versions over there.
Backport of #4520 to the 0.13.x branch. It fixes Files metadata table queries with filters, which sometimes return wrong results for V2 tables with specific partition spec evolutions involving dropped fields, and which throw IndexOutOfBoundsException in other cases where the partition spec has changed.
Adapted to the older code in this branch.
This works, except that some test cases fail when querying dropped partition fields in V1 tables because of a related but different issue, #3411. So one option is to backport that one first for the tests, though the fix is orthogonal.
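For readers unfamiliar with the failure mode, here is a toy sketch of how both symptoms in the description (wrong results and an index-out-of-bounds error) can arise when a filter is resolved by position against partition tuples written under different specs. This is an illustration only and not Iceberg's actual code: the spec layout, field IDs, file names, and the helpers `value_by_position`, `value_by_field_id`, and `filter_files` are all hypothetical, and the real fix in #4520 may work differently.

```python
# Toy model only: NOT Iceberg's classes or API. All names here are hypothetical.
# A "spec" is an ordered list of (field_id, name) pairs; a partition tuple
# stores values in the same order as the spec that wrote the file.

SPEC_0 = [(1000, "category"), (1001, "day")]  # original spec
SPEC_1 = [(1001, "day")]                      # after dropping "category"

# Unified partition type across all specs (assumed shape of what a
# metadata table would expose to the query).
UNIFIED = [(1000, "category"), (1001, "day")]

FILES = [
    {"path": "a.parquet", "spec": SPEC_0, "partition": ("books", "2022-01-01")},
    {"path": "b.parquet", "spec": SPEC_1, "partition": ("2022-01-02",)},
]


def value_by_position(file, name):
    """Naive lookup: use the field's position in the unified type."""
    pos = [n for _, n in UNIFIED].index(name)
    # May read the wrong slot, or raise IndexError for shorter tuples.
    return file["partition"][pos]


def value_by_field_id(file, name):
    """Lookup by field ID in the file's own spec; None if the field is absent."""
    field_id = next(fid for fid, n in UNIFIED if n == name)
    for pos, (fid, _) in enumerate(file["spec"]):
        if fid == field_id:
            return file["partition"][pos]
    return None


def filter_files(lookup, name, expected):
    """Return paths of files whose partition value for `name` equals `expected`."""
    return [f["path"] for f in FILES if lookup(f, name) == expected]
```

In this toy model, filtering on `category` by position wrongly matches `b.parquet` (it reads the `day` slot), and filtering on `day` by position runs off the end of `b.parquet`'s one-element tuple. Resolving by field ID avoids both.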